Similarity and Location Aware Scalable Deduplication System for Virtual Machine Storage Systems

Author

  • M. Kavitha
Abstract

I. INTRODUCTION

With the potentially unlimited storage space offered by cloud providers, users tend to use as much space as they can, and vendors continually look for techniques to reduce redundant data and maximize space savings. A widely adopted technique is cross-user deduplication. The simple idea behind deduplication is to store duplicate data only once. Therefore, if a user wants to upload a file that is already stored, the cloud provider adds the user to the owner list of that file. Deduplication has proved to realize high space and cost savings, and many cloud storage providers are currently adopting it.

Deduplication eliminates unneeded data segments from the backup and reduces the size of the backup data. This is particularly useful in cloud storage, where data is transferred to the storage target over a WAN. Deduplication with cloud storage not only reduces the storage space requirements but also reduces the data transferred over the network, resulting in faster and more efficient data protection operations.

In Direct Deduplication to cloud, the cloud storage is defined as the storage target in the Media Agent. Deduplication is enabled either at the client or at the Media Agent. The deduplicated data is transferred to the Cloud Storage library. The deduplication database resides on the Media Agent (or on a designated volume attached to the Media Agent). Deduplication can be enabled for secondary copies during Storage Policy Copy creation. In this setup, data on the primary copy is not deduplicated. The non-deduplicated data can be deduplicated by enabling deduplication on the ...
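The cross-user deduplication idea described in the introduction (store each distinct file once and record every uploader in its owner list) can be sketched as follows. This is a minimal illustration under stated assumptions, not the system proposed in the paper: the class name `DedupStore`, the use of SHA-256 as the content fingerprint, and whole-file (rather than chunk-level) granularity are all hypothetical choices made for the example.

```python
import hashlib


class DedupStore:
    """Toy cross-user deduplicating store (illustrative assumption only)."""

    def __init__(self):
        self.blobs = {}   # content fingerprint -> file bytes, stored once
        self.owners = {}  # content fingerprint -> set of user ids

    def upload(self, user: str, data: bytes) -> bool:
        """Record `data` for `user`; return True only if new bytes were stored."""
        digest = hashlib.sha256(data).hexdigest()
        is_new = digest not in self.blobs
        if is_new:
            # First time this content is seen: keep a single physical copy.
            self.blobs[digest] = data
            self.owners[digest] = set()
        # Duplicate or not, the uploader joins the owner list of the content.
        self.owners[digest].add(user)
        return is_new

    def stored_bytes(self) -> int:
        """Total physical bytes held, counting each distinct file once."""
        return sum(len(blob) for blob in self.blobs.values())
```

When two users upload identical content, the second call to `upload` returns `False` and `stored_bytes()` does not grow; only the owner set for that fingerprint gains a member.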

Similar resources

Boafft: Distributed Deduplication for Big Data Storage in the Cloud

As data progressively grows within data centers, the cloud storage systems continuously face challenges in saving storage capacity and providing capabilities necessary to move big data within an acceptable time frame. In this paper, we present the Boafft, a cloud storage system with distributed deduplication. The Boafft achieves scalable throughput and capacity using multiple data servers to dedu...


Merging Similarity and Trust Based Social Networks to Enhance the Accuracy of Trust-Aware Recommender Systems

In recent years, collaborative filtering (CF) methods have become important and widely accepted techniques for recommender systems. One of these techniques is the user-based approach, which produces useful recommendations based on the similarity between the ratings of like-minded users. However, these systems suffer from several inherent shortcomings such as data sparsity and cold start problems. With the dev...


Sed: Scalable & Efficient De-duplication File System for Virtual Machine Images

Virtualization is becoming widely deployed in servers to efficiently provide many logically separate execution environments by reducing the demand for physical servers, so this approach reserves physical CPU resources. Nevertheless, it still consumes large amounts of storage because each virtual machine (VM) instance needs its own multi-gigabyte disk image. Existing systems take efforts to red...


Low-Cost Data Deduplication for Virtual Machine Backup in Cloud Storage

In a virtualized cloud cluster, frequent snapshot backup of virtual disks improves hosting reliability; however, it takes significant memory resources to detect and remove duplicated content blocks among snapshots. This paper presents a low-cost deduplication solution scalable for a large number of virtual machines. The key idea is to separate duplicate detection from the actual storage backup i...


A Scalable Inline Cluster Deduplication Framework for Big Data Protection

Cluster deduplication has become a widely deployed technology in data protection services for Big Data to satisfy the requirements of service level agreement (SLA). However, it remains a great challenge for cluster deduplication to strike a sensible tradeoff between the conflicting goals of scalable deduplication throughput and high duplicate elimination ratio in cluster systems with low-end in...




Publication date: 2015